{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Estimating Fault Tolerant Resources for Periodic Systems\n",
    "\n",
    "The resource estimation code provided in this module relies on `pyscf` to compute the required symmetry adapted molecular orbitals, one- and two-electron integrals, and correlated wavefunction calculations to determine the accuracy of different truncation schemes. So, to start the tutorial let's run a periodic restricted Hartree-Fock (RHF) simulation of diamond. \n",
    "\n",
    "We assume a knowledge of electronic structure theory for solids, and in particular an understanding that any results need to be converged to the infinite $k$-point (thermodynamic) and complete basis set limit. For the purposes of this tutorial we will focus on results far away from this limit and simulate simple systems in minimal basis sets use very small $k$-point meshes. The module also assumes the use of density fitted integrals and we use range-separated density fitting throughout. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "plt.rcParams['figure.dpi'] = 130 \n",
    "\n",
    "from ase.build import bulk\n",
    "import numpy as np\n",
    "\n",
    "from pyscf.pbc import gto, scf\n",
    "from pyscf.pbc.tools import pyscf_ase\n",
    "\n",
    "\n",
    "# Build a 2 atom unit cell for carbon in the diamond structure near it's\n",
    "# equilibrium lattice constant.\n",
    "ase_atom = bulk(\"C\", \"diamond\", a=3.5)\n",
    "cell = gto.Cell()\n",
    "cell.atom = pyscf_ase.ase_atoms_to_pyscf(ase_atom)\n",
    "cell.a = ase_atom.cell[:].copy()\n",
    "# Using a minimal basis set for expediency.\n",
    "cell.basis = \"gth-szv\"\n",
    "cell.pseudo = \"gth-hf-rev\"\n",
    "cell.verbose = 0\n",
    "cell.build()\n",
    "\n",
    "# We are using a very small k-point mesh for speed purposes too.\n",
    "kmesh = [1, 1, 3]\n",
    "kpts = cell.make_kpts(kmesh)\n",
    "num_kpts = len(kpts)\n",
    "mf = scf.KRHF(cell, kpts).rs_density_fit()\n",
    "mf.kernel()\n",
    "print(\"SCF energy: \", mf.e_tot)\n",
    "\n",
    "# converged SCF energy: -10.39193609748544"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Armed with our SCF solution we can now generate the one and two-electron integrals required to compute $\\lambda$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openfermion.resource_estimates.pbc.hamiltonian import build_hamiltonian\n",
    "# Get molecular orbital \"Cholesky\" integrals from RSGDF object, these are just\n",
    "# 3-centre integrals (X|pq).\n",
    "hcore_mo, chol = build_hamiltonian(mf)\n",
    "print(\"(nkpts, nmo, nmo) = {}\".format(hcore_mo.shape))\n",
    "print(\"(nkpts, nkpts) = {}\".format(chol.shape))\n",
    "print(\"(naux, nmo, nmo) = {}\".format(chol[0,0].shape))\n",
    "num_mo = hcore_mo.shape[-1]\n",
    "num_aux = chol[0,0].shape[0]"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Hamiltonian Representation\n",
    "The resource estimation module provides four representations for the Hamiltonian: sparse, single-factorization (SF), double-factorization (SF), and tensor hypercontraction (THC). Each of these approaches introduces a different parameter which controls the accuracy of the factorization, and results need to be monitored with respect to these truncation parameters.\n",
    "\n",
    "## Sparse Hamiltonian\n",
    "\n",
    "The sparse Hamiltonian takes the usual form for a second-quantized $k$-point dependent Hamiltonian\n",
    "\n",
    "$$\n",
    "\n",
    "H = \\sum_{pq\\mathrm{k}} h_{pq}(\\mathrm{k}) a_{p\\mathbf{k}}^{\\dagger} a_{q\\mathbf{k}} + \\frac{1}{2} \\sum_{\\mathbf{k}_p\\mathbf{k}_q\\mathbf{k}_r\\mathbf{k}_s}\\sum_{pqrs} (p\\mathbf{k}_pq\\mathbf{k}_q|r\\mathbf{k}_rs\\mathbf{k}_s)  a_{p\\mathbf{k}_p}^{\\dagger} a_{r\\mathbf{k}_r}^{\\dagger} a_{s\\mathbf{k}_s}a_{q\\mathbf{k}_q} \n",
    "\n",
    "$$\n",
    "\n",
    "Utilizing conservation of crystal momentum $\\mathbf{k}_p + \\mathbf{k}_r - \\mathbf{k}_q -\\mathbf{k}_s = \\mathbf{G}$, where $\\mathbf{G}$ is a reciprocal lattice vector, we can write \n",
    "\n",
    "$$\n",
    "\n",
    "H = \\sum_{pq\\mathrm{k}} h_{pq}(\\mathrm{k}) a_{p\\mathbf{k}}^{\\dagger} a_{q\\mathbf{k}} + \\frac{1}{2} \\sum_{\\mathbf{Q}\\mathbf{k}\\mathbf{k}'}\\sum_{pqrs} (p\\mathbf{k}q\\mathbf{k}-\\mathbf{Q}|r\\mathbf{k}'-\\mathbf{Q}s\\mathbf{k}')  a_{p\\mathbf{k}}^{\\dagger} a_{r\\mathbf{k}'-\\mathbf{Q}}^{\\dagger} a_{s\\mathbf{k}'}a_{q\\mathbf{k}-\\mathbf{Q}} \n",
    "\n",
    "$$\n",
    "\n",
    "where $\\mathbf{Q}$ is the momentum transfer vector which we chose to live in our set of $k$-points. Note the subtraction in the above expression is really modulo a $\\mathbf{G}$ vector.\n",
    "\n",
    "The Hamiltonian above has $N_k^3 N^4$ terms, where $N_k$ is the number of $k$-points and $N$ is the number of spin orbitals. The sparse representation attempts to approximate the Hamiltonian by zeroing elements of $H$ which are below some threshold. This will yield $\\mathcal{O}(s N_k^3 N^4)$ terms where $s$ is a sparsity factor.\n",
    "\n",
    "We provide helper functions that will sparsify the Hamiltonian to aid in resource estimation:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openfermion.resource_estimates.pbc import sparse\n",
    "sparse_ham = sparse.SparseFactorization(chol, mf, threshold=1e-3)\n",
    "# look at eri block\n",
    "kpts = [0]*4\n",
    "eri_approx = sparse_ham.get_eri(kpts)\n",
    "eri_exact = sparse_ham.get_eri_exact(kpts)\n",
    "# With a sparsity threshold of 1e-3, the approximate eris and \"exact\" eris (i.e.\n",
    "# those with no truncation)  should yield different results.\n",
    "assert not np.allclose(eri_approx, eri_exact)\n",
    "print(\"Total number of elements N_k^3 N^4 = {:d}\".format(num_kpts**3*num_mo**4))\n",
    "print(\"number of symmetry unique non zero = {}\".format(sparse_ham.get_total_unique_terms_above_thresh()))"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With the Hamiltonian at hand we can compute the 1-norm (called $\\lambda$) of the Hamiltonian which is essential for computing resource estimates:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sparse_lambda = sparse.compute_lambda(hcore_mo, sparse_ham)\n",
    "print(sparse_lambda)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally we can compute the total resource costs. In particular the following code will compute the number of Toffoli gates required for a single step of phase estimation (`toffolis_per_step`), the total Toffoli count (`total_toffolis`), and the number of logical qubits (`logical_qubits`).\n",
    "\n",
    "The total Toffoli count is given by the toffoli per step cost multiplied by the factor \n",
    "\n",
    "$$\n",
    "\n",
    "\\left\\lceil \\frac{\\pi \\lambda}{2 \\epsilon_{\\mathrm{QPE}}} \\right\\rceil\n",
    "\n",
    "$$\n",
    "\n",
    "where $\\epsilon_{\\mathrm{QPE}}$ is our variable `dE_for_qpe`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "num_spin_orbs = 2*num_mo\n",
    "resources = sparse.compute_cost(num_spin_orbs, sparse_lambda.lambda_total, sparse_lambda.num_sym_unique , kmesh, dE_for_qpe=0.0016)\n",
    "print(resources)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We see there are the following steps required:\n",
    "\n",
    "1. Run an SCF calculation.\n",
    "2. Generate the one- and two-electron matrix elements.\n",
    "3. Compute the lambda value of the Hamiltonian. \n",
    "4. Compute the resource estimates.\n",
    "\n",
    "These steps are required for all four factorizations and involve a lot of boilerplate code. As such we provide utility functions which perform these necessary steps, and scan over the value of the corresponding threshold parameter computing either the MP2 or CCSD correlation energy for each threshold value.\n",
    "\n",
    "Let's see how this works for the sparse factorization first."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "thresholds = np.logspace(-1, -5, 5)\n",
    "sparse_costing_table = sparse.generate_costing_table(mf, thresholds=thresholds)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The costing table is a `pandas.DataFrame` which can be convenient for saving to `csv` `(to_csv)`, string `(to_sting())` or outputting results to $\\LaTeX$ (`to_latex()`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(sparse_costing_table.to_string(index=False))"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A crucial step in resource estimation is determining when the Hamiltonian is sufficiently accurate given a certain truncation. One way to check this is to use a correlated wavefunction method and monitor the convergence of the correlation energy with the truncation parameter. Let's look at the convergence of the MP2 error with the sparsity threshold. Note that MP2 is not expected to be a particularly faithful model to monitor convergence as it only requires a subset of the integral blocks. CCSD is a better option but comes with a considerable overhead."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.plot(sparse_costing_table.cutoff, np.abs(sparse_costing_table.approx_energy-sparse_costing_table.exact_energy), marker=\"o\")\n",
    "plt.xscale(\"log\")\n",
    "plt.yscale(\"log\")\n",
    "plt.xlabel(\"sparse threshold\")\n",
    "plt.ylabel(\"MP2 Energy Error (Ha)\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.plot(sparse_costing_table.cutoff, np.abs(sparse_costing_table.lambda_total), marker=\"o\")\n",
    "plt.xscale(\"log\")\n",
    "plt.xlabel(\"sparse threshold\")\n",
    "plt.ylabel(\"$\\lambda$\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.plot(sparse_costing_table.cutoff, np.abs(sparse_costing_table.toffolis_per_step), marker=\"o\")\n",
    "plt.xscale(\"log\")\n",
    "plt.xlabel(\"sparse threshold\")\n",
    "plt.ylabel(\"Toffolis per step\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Single Factorization \n",
    "\n",
    "The single factorization approach follows from the usual density fitting or Cholesky factorization of the two-electron integrals:\n",
    "\n",
    "$$\n",
    "\n",
    "(p\\mathbf{k}q\\mathbf{k}-\\mathbf{Q}|r\\mathbf{k}'-\\mathbf{Q}s\\mathbf{k}') = \\sum_n^{M} L_{p\\mathbf{k}q\\mathbf{k}-\\mathbf{Q}}^n L_{s\\mathbf{k}'r\\mathbf{k}'-\\mathbf{Q}}^{n*} \n",
    "\n",
    "\n",
    "$$\n",
    "where $M$ is the dimension of an auxiliary index $n$. With this factorization we can define \n",
    "\n",
    "$$\n",
    "\n",
    "\\hat{\\rho}_{n}(\\mathbf{Q}, \\mathbf{k}) = \\left(\\sum_{\\sigma \\in \\{\\uparrow, \\downarrow\\}}\\sum_{pq}^{N/2}L_{p \\mathbf{k} q (\\mathbf{k}-\\mathbf{Q}), n} a_{p\\mathbf{k}\\sigma}^\\dagger a_{q(\\mathbf{k}-\\mathbf{Q})\\sigma}  \\right)  ,\\qquad \\hat{\\rho}^\\dagger_{n}(\\mathbf{Q}, \\mathbf{k}) = \\left( \\sum_{\\sigma \\in \\{\\uparrow, \\downarrow\\}} \\sum_{pq}^{N/2}L^{*}_{p \\mathbf{k} q (\\mathbf{k}-\\mathbf{Q}), n} a_{q(\\mathbf{k}-\\mathbf{Q})\\sigma}^\\dagger a_{p\\mathbf{k}\\sigma}\\right) \n",
    "\n",
    "$$\n",
    "\n",
    "to write\n",
    "\n",
    "$$\n",
    "\n",
    "\\hat{H}_2 = \\frac{1}{2}  \\sum_{\\mathbf{Q}}^{N_{k}}\\sum_{n}^{M} \\left(\\hat{A}^2_{n}(\\mathbf{Q}) + \\hat{B}^2_{n}(\\mathbf{Q})\\right) + \\mathrm{one\\ body\\ term},\n",
    "\n",
    "$$\n",
    "\n",
    "where\n",
    "\n",
    "$$\n",
    "\n",
    "\\hat{A}_{n}(\\mathbf{Q}) =\\frac{1}{2}(\\hat{\\rho}_{n}(\\mathbf{Q}) + \\hat{\\rho}^\\dagger_{n}(\\mathbf{Q})),\\\\\n",
    "\\hat{B}_{n}(\\mathbf{Q}) = \\frac{i}{2}(\\hat{\\rho}_{n}(\\mathbf{Q}) - \\hat{\\rho}^\\dagger_{n}(\\mathbf{Q})).\n",
    "\n",
    "$$\n",
    "\n",
    "Given this representation of the Hamiltonian it remains to check how rapidly the factorization converges with $M$ typically called `num_aux` below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openfermion.resource_estimates.pbc import sf \n",
    "cutoffs = np.arange(10, num_aux, 10)\n",
    "sf_costing_table = sf.generate_costing_table(mf, naux_cutoffs=cutoffs)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(sf_costing_table.to_string(index=None))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.plot(sf_costing_table.cutoff, np.abs(sf_costing_table.approx_energy-sf_costing_table.exact_energy), marker=\"o\")\n",
    "plt.yscale(\"log\")\n",
    "plt.xlabel(\"$M$\")\n",
    "plt.ylabel(\"MP2 Energy Error (Ha)\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "## Double Factorization \n",
    "\n",
    "The double factorization follows on from the single-factorized Hamiltonian, and one performs a further factorization on the $A$ and $B$ operators defined above. \n",
    "\n",
    "$$\n",
    "\n",
    "\\hat{A}_{n}(\\mathbf{Q}) = \\sum_{\\mathbf{k}} \\left[ U^A_{n}(\\mathbf{Q}, \\mathbf{k}) \\left( \\sum_{\\sigma}\\sum_{p}^{\\Xi_{\\mathbf{Q}, n,\\mathbf{k},A}}f^A_{p}(\\mathbf{Q}, n, \\mathbf{k})n_{p\\mathbf{k}\\sigma} \\right) U^A_{n}(\\mathbf{Q}, \\mathbf{k})^{\\dagger} \\right]\n",
    "\n",
    "$$\n",
    "where $U^A$ is a unitary that diagonalizes $A$, $f_p$ are the corresponding eigenvalues, and $n_{p\\mathbf{k}}$ is the number operator. Note that the sum over $\\mathbf{k}$ is factored out of the expression which differs from the single factorized case and is crucial to observe a $\\sqrt{N_k}$ speedup in the quantum algorithm. We thus need to monitor convergence with respect to $M$ and the number of eigenvalues $\\sum_{\\mathbf{Q},n,\\mathbf{k}}\\Xi_{\\mathbf{Q}, n,\\mathbf{k},A}$, typically called `num_eigs` below. Note that we fix $M$ to be the full rank and just monitor convergence w.r.t the second factorization for simplicity."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openfermion.resource_estimates.pbc import df\n",
    "cutoffs = np.logspace(-1, -5, 5)\n",
    "df_costing_table = df.generate_costing_table(mf, cutoffs=cutoffs)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(df_costing_table.to_string(index=False))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.rcParams['figure.dpi'] = 130 \n",
    "plt.plot(df_costing_table.cutoff, np.abs(df_costing_table.approx_energy-df_costing_table.exact_energy), marker=\"o\")\n",
    "plt.yscale(\"log\")\n",
    "plt.xscale(\"log\")\n",
    "plt.xlabel(\"eigenvalue cutoff\")\n",
    "plt.ylabel(\"MP2 Energy Error (Ha)\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Tensor hypercontraction factorization\n",
    "\n",
    "The tensor hypercontraction factorization (THC) is somewhat involved and we refer the reader to the [isdf_notebook](./isdf.ipynb) for further details, however the procedure follows much like before (except it is considerably more expensive to generate the factorization). For expediency we will forego reoptimizing the THC factors in the example below and just look at the ISDF convergence. This may take a few minutes to run."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openfermion.resource_estimates.pbc import thc \n",
    "thc_rank_params = [2, 4, 6, 8] \n",
    "thc_costing_table = thc.generate_costing_table(mf, thc_rank_params=thc_rank_params, reoptimize=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(thc_costing_table.to_string(index=False))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.rcParams['figure.dpi'] = 130 \n",
    "plt.plot(thc_costing_table.cutoff, np.abs(thc_costing_table.approx_energy-thc_costing_table.exact_energy), marker=\"o\")\n",
    "plt.yscale(\"log\")\n",
    "plt.xlabel(\"THC rank parameter\")\n",
    "plt.ylabel(\"MP2 Energy Error (Ha)\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Other Considerations\n",
    "\n",
    "\n",
    "### Spin polarization\n",
    "The discussion so far assumed we were performing a RHF calculation to generate spin-free integrals and a closed-shell mean field solution. While unrestricted solutions are not supported, the resource estimation code will work for ROHF solutions. This can be helpful if the system of interest has a different number of alpha and beta electrons.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from ase.build import bulk\n",
    "import numpy as np\n",
    "\n",
    "from pyscf.pbc import gto, scf\n",
    "from pyscf.pbc.tools import pyscf_ase\n",
    "# Build a 2 atom unit cell for carbon in the diamond structure near it's\n",
    "# equilibrium lattice constant.\n",
    "ase_atom = bulk(\"C\", \"diamond\", a=3.5)\n",
    "cell = gto.Cell()\n",
    "cell.atom = pyscf_ase.ase_atoms_to_pyscf(ase_atom)\n",
    "cell.a = ase_atom.cell[:].copy()\n",
    "# Using a minimal basis set for expediency.\n",
    "cell.basis = \"gth-szv\"\n",
    "cell.pseudo = \"gth-hf-rev\"\n",
    "cell.verbose = 0\n",
    "cell.spin = 2  # Force high spin for ROHF\n",
    "cell.build()\n",
    "\n",
    "# We are using a very small k-point mesh for speed purposes too.\n",
    "kmesh = [1, 1, 3]\n",
    "kpts = cell.make_kpts(kmesh)\n",
    "num_kpts = len(kpts)\n",
    "mf = scf.KRHF(cell, kpts).rs_density_fit()\n",
    "mf.kernel()\n",
    "print(\"SCF energy: \", mf.e_tot)\n",
    "print(mf.cell.nelec)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cutoffs = np.logspace(-1, -5, 5)\n",
    "df_costing_table = df.generate_costing_table(mf, cutoffs=cutoffs)\n",
    "print(df_costing_table.to_string(index=False))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "### Threshold convergence\n",
    "\n",
    "So far we have used MP2 to monitor the convergence of the factorization, however we can instead use CCSD as a better model chemistry."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "thresholds = np.logspace(-1, -5, 5)\n",
    "sparse_costing_table = sparse.generate_costing_table(mf, thresholds=thresholds, energy_method=\"CCSD\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(sparse_costing_table.to_string(index=False))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "kpoint_eri",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
